Abstract

This paper develops a general approach, rooted in statistical learning theory, to learning an approximately
revenue-maximizing auction from data. We introduce t-level auctions to interpolate between
simple auctions, such as welfare maximization with reserve prices, and optimal auctions, thereby balancing
the competing demands of expressivity and simplicity. We prove that such auctions have small
representation error, in the sense that for every product distribution F over bidders’ valuations, there 
exists a t-level auction with small t and expected revenue close to optimal. We show that the set of 
t-level auctions has modest pseudo-dimension (for polynomial t) and therefore leads to small learning 
error. One consequence of our results is that, in arbitrary single-parameter settings, one can learn a 
mechanism with expected revenue arbitrarily close to optimal from a polynomial number of samples. 


1 Introduction 

In the traditional economic approach to identifying a revenue-maximizing auction, one first posits a prior 
distribution over all unknown information, and then solves for the auction that maximizes expected revenue 
with respect to this distribution. The first obstacle to making this approach operational is the difficulty of 
formulating an appropriate prior. The second obstacle is that, even if an appropriate prior distribution is 
available, the corresponding optimal auction can be far too complex and unintuitive for practical use. This 
motivates the goal of identifying auctions that are “simple” and yet nearly-optimal in terms of expected 
revenue. 

In this paper, we apply tools from learning theory to address both of these challenges. In our model, 
we assume that bidders’ valuations (i.e., “willingness to pay”) are drawn from an unknown distribution F. 
A learning algorithm is given i.i.d. samples from F. For example, these could represent the outcomes of 
comparable transactions that were observed in the past. The learning algorithm suggests an auction to use 
for future bidders, and its performance is measured by comparing the expected revenue of its output auction 
to that earned by the optimal auction for the distribution F. 

The possible outputs of the learning algorithm correspond to some set C of auctions. We view C as 
a design parameter that can be selected by a seller, along with the learning algorithm. A central goal of 
this work is to identify classes C that balance representation error (the amount of revenue sacrificed by 
restricting to auctions in C) with learning error (the generalization error incurred by learning over C from 
samples). That is, we seek a set C that is rich enough to contain an auction that closely approximates an 
optimal auction (whatever F might be), yet simple enough that the best auction in C can be learned from 
a small amount of data. Learning theory offers tools both for rigorously defining the “simplicity” of a set 
C of auctions, through well-known complexity measures such as the pseudo-dimension, and for quantifying 
the amount of data necessary to identify the approximately best auction from C. Our goal of learning a 
near-optimal auction also requires understanding the representation error of different classes C; this task is 
problem-specific, and we develop the necessary arguments in this paper. 

1.1 Our Contributions


The primary contributions of this paper are the following. First, we show that well-known concepts from 
statistical learning theory can be directly applied to reason about learning from data an approximately 
revenue-maximizing auction. Precisely, for a set C of auctions and an arbitrary unknown distribution F
over valuations in $[1, H]$, $O\left(\left(\frac{H}{\epsilon}\right)^2 d_C \log \frac{H}{\epsilon}\right)$ samples from F are enough to learn (up to a $1 - \epsilon$ factor) the
best auction in C, where $d_C$ denotes the pseudo-dimension of the set C (defined in Section 2). Second, we
introduce the class of t-level auctions, to interpolate smoothly between simple auctions, such as welfare 
maximization subject to individualized reserve prices (when t = 1), and the complex auctions that can arise 
as optimal auctions (as $t \to \infty$). Third, we prove that in quite general auction settings with n bidders,
the pseudo-dimension of the set of t-level auctions is $O(nt \log nt)$. Fourth, we quantify the number t of
levels required for the set of t-level auctions to have low representation error, with respect to the optimal
auctions that arise from arbitrary product distributions F. For example, for single-item auctions and several
generalizations thereof, if $t = \Omega(H/\epsilon)$, then for every product distribution F there exists a t-level auction with
expected revenue at least $1 - \epsilon$ times that of the optimal auction for F.

In the above sense, the “t” in t-level auctions is a tunable “sweet spot”, allowing a designer to balance 
the competing demands of expressivity (to achieve near-optimality) and simplicity (to achieve learnability). 
For example, given a fixed amount of past data, our results indicate how much auction complexity (in the 
form of the number of levels t) one can employ without risking overfitting the auction to the data. 

Alternatively, given a target approximation factor $1 - \epsilon$, our results give sufficient conditions on t and
consequently on the number of samples needed to achieve this approximation factor. The resulting sample
complexity upper bound has polynomial dependence on $H$, $\epsilon^{-1}$, and the number n of bidders. Known
results [1, 8] imply that any method of learning a $(1 - \epsilon)$-approximate auction from samples must have sample
complexity with polynomial dependence on all three of these parameters, even for single-item auctions.

1.2 Related Work 

The present work shares much of its spirit and high-level goals with Balcan et al. [4], who proposed applying 
statistical learning theory to the design of near-optimal auctions. The first-order difference between the 
two works is that our work assumes bidders’ valuations are drawn from an unknown distribution, while 
Balcan et al. [4] study the more demanding “prior-free” setting. Since no auction can achieve near-optimal 
revenue ex-post, Balcan et al. [4] define their revenue benchmark with respect to a set Q of auctions on each
input v as the maximum revenue obtained by any auction of Q on v. The idea of learning from samples
enters the work of Balcan et al. [4] through the internal randomness of their partitioning of bidders, rather
than through an exogenous distribution over inputs (as in this work). Both our work and theirs require
polynomial dependence on $H$ and $1/\epsilon$: ours in terms of a necessary number of samples, and theirs in terms of
a necessary number of bidders; as well as a measure of the complexity of the class Q (in our case, the
pseudo-dimension, and in theirs, an analogous measure). The primary improvement of our work over the
results in Balcan et al. [4] is that our results apply for single-item auctions, matroid feasibility, and arbitrary
single-parameter settings (see Section 2 for definitions), while their results apply only to single-parameter
settings of unlimited supply.^1 We also view as a feature the fact that our sample complexity upper bounds
can be deduced directly from well-known results in learning theory — we can focus instead on the non-trivial 
and problem-specific work of bounding the pseudo-dimension and representation error of well-chosen auction 
classes. 

Elkind [12] also considers a similar model to ours, but only for the special case of single-item auctions. 
While her proposed auction format is similar to ours, our results cover the far more general case of arbitrary 
single-parameter settings and non-finite support distributions; our sample complexity bounds are also
better even in the case of a single-item auction (linear rather than quadratic dependence on the number of 
bidders). On the other hand, the learning algorithm in [12] (for single-item auctions) is computationally 
efficient, while ours is not. 

^1 See Balcan et al. [3] for an extension to the case of a large finite supply.

Cole and Roughgarden [8] study single-item auctions with n bidders with valuations drawn from independent
(not necessarily identical) “regular” distributions (see Section 2), and prove upper and lower bounds
(polynomial in n and $\epsilon^{-1}$) on the sample complexity of learning a $(1 - \epsilon)$-approximate auction. While the
formalism in their work is inspired by learning theory, no formal connections are offered; in particular, both 
their upper and lower bounds were proved from scratch. Our positive results include single-item auctions 
as a very special case and, for bounded or MHR valuations, our sample complexity upper bounds are much 
better than those in Cole and Roughgarden [8]. 

Huang et al. [15] consider learning the optimal price from samples when there is a single buyer and a 
single seller; this problem was also studied implicitly in [10]. Our general positive results obviously cover the 
bounded-valuation and MHR settings in [15], though the specialized analysis in [15] yields better (indeed, 
almost optimal) sample complexity bounds, as a function of $\epsilon^{-1}$ and/or H.

Medina and Mohri [17] show how to use a combination of the pseudo-dimension and Rademacher complexity
to measure the sample complexity of selecting a single reserve price for the VCG mechanism to
optimize revenue. In our notation, this corresponds to analyzing a single set C of auctions (VCG with a 
reserve). Medina and Mohri [17] do not address the expressivity vs. simplicity trade-off that is central to 
this paper. 

Dughmi et al. [11] also study the sample complexity of learning good auctions, but their main results are 
negative (exponential sample complexity), for the difficult scenario of multi-parameter settings. (All settings 
in this paper are single-parameter.) 

Our work on t-level auctions also contributes to the literature on simple approximately revenue-maximizing 
auctions (e.g., [6, 14, 7, 9, 21, 24, 2]). Here, one takes the perspective of a seller who knows the valuation 
distribution F but is bound by a “simplicity constraint” on the auction deployed, thereby ruling out the 
optimal auction. Our results that bound the representation error of t-level auctions (Theorems 3.4, 4.1, 5.4,
and 6.2) can be interpreted as a principled way to trade off the simplicity of an auction with its approximation
guarantee. While previous work in this literature generally left the term “simple” safely undefined, this
paper effectively proposes the pseudo-dimension of an auction class as a rigorous and quantifiable simplicity 
measure. 


2 Preliminaries 

This section reviews useful terminology and notation standard in Bayesian auction design and learning 
theory. 

Bayesian Auction Design We consider single-parameter settings with n bidders. This means that each
bidder has a single unknown parameter, its valuation or willingness to pay for “winning.” (Every bidder
has value 0 for losing.) A setting is specified by a collection $\mathcal{X}$ of subsets of $\{1, 2, \ldots, n\}$; each such subset
represents a collection of bidders that can simultaneously “win.” For example, in a setting with k copies of
an item, where no bidder wants more than one copy, $\mathcal{X}$ would be all subsets of $\{1, 2, \ldots, n\}$ of cardinality at
most k.

A generalization of this case, studied in Section 5, is matroid settings. These satisfy: (i) whenever $X \in \mathcal{X}$
and $Y \subseteq X$, $Y \in \mathcal{X}$; and (ii) for two sets $I_1, I_2 \in \mathcal{X}$ with $|I_1| < |I_2|$, there is always an augmenting element
$i_2 \in I_2 \setminus I_1$ such that $I_1 \cup \{i_2\} \in \mathcal{X}$. Section 6 also considers arbitrary single-parameter settings, where the
only assumption is that $\emptyset \in \mathcal{X}$. To ease comprehension, we often illustrate our main ideas using single-item
auctions (where $\mathcal{X}$ is the singletons and the empty set).
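
The two matroid conditions can be checked mechanically on small instances. The following is a minimal sketch, assuming feasible sets are represented as Python frozensets; the helper names (`k_unit_setting`, `is_downward_closed`, `has_exchange_property`, `is_matroid`) are hypothetical and introduced only for illustration. The brute-force enumeration is exponential in n and is meant for toy examples.

```python
from itertools import combinations

def all_subsets(items):
    """Every subset of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def k_unit_setting(n, k):
    """Feasible sets when k identical copies are sold to bidders 1..n."""
    return {s for s in all_subsets(range(1, n + 1)) if len(s) <= k}

def is_downward_closed(feasible):
    """Condition (i): every subset of a feasible set is feasible."""
    return all(sub in feasible for s in feasible for sub in all_subsets(s))

def has_exchange_property(feasible):
    """Condition (ii): a smaller feasible set can be augmented from a larger one."""
    return all(any(small | {e} in feasible for e in large - small)
               for small in feasible for large in feasible
               if len(small) < len(large))

def is_matroid(feasible):
    return (frozenset() in feasible
            and is_downward_closed(feasible)
            and has_exchange_property(feasible))

# The single-item setting is the k = 1 case: the singletons and the empty set.
single_item = k_unit_setting(3, 1)
```

For instance, the collection consisting of the empty set, {1}, {2}, {3} and {1, 2} is downward closed but fails condition (ii), since {3} cannot be augmented from {1, 2}; it is a single-parameter setting but not a matroid setting.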

We assume bidders’ valuations are drawn from the continuous joint cumulative distribution F. Except in
the extension in Section 4, we assume that the support of F is limited to $[1, H]^n$. As in most of optimal auction
theory [18], we usually assume that F is a product distribution, with $F = F_1 \times F_2 \times \cdots \times F_n$ and each $v_i \sim F_i$
drawn independently but not identically. The virtual value of bidder i is denoted by $\phi_i(v_i) = v_i - \frac{1 - F_i(v_i)}{f_i(v_i)}$, where $f_i$ is the density of $F_i$.

A distribution satisfies the monotone-hazard rate (MHR) condition if $f_i(v_i)/(1 - F_i(v_i))$ is nondecreasing;
intuitively, if its tails are no heavier than those of an exponential distribution. In a fundamental paper, Myerson [18]
proved that when every virtual valuation function is nondecreasing (the “regular” case), the auction that
maximizes expected revenue for n Bayesian bidders chooses winners in a way which maximizes the sum of
the virtual values of the winners. This auction is known as Myerson’s auction, which we refer to as $\mathcal{M}$.
The result can be extended to the general, “non-regular” case by replacing the virtual valuation functions
by “ironed virtual valuation functions.” The details are well-understood but technical; see Myerson [18] and
Hartline [13].
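
As a small illustration of virtual values, the sketch below evaluates $\phi_i$ for uniform distributions and picks the single-item winner in the regular case. It assumes uniform valuations on an interval (for which $\phi(v) = 2v - \text{hi}$); the function names `virtual_value`, `uniform` and `myerson_winner` are hypothetical, and payments and ironing are deliberately omitted.

```python
def virtual_value(v, cdf, pdf):
    """phi(v) = v - (1 - F(v)) / f(v)."""
    return v - (1.0 - cdf(v)) / pdf(v)

def uniform(lo, hi):
    """CDF and density of the uniform distribution on [lo, hi] (an assumed example)."""
    cdf = lambda v: (v - lo) / (hi - lo)
    pdf = lambda v: 1.0 / (hi - lo)
    return cdf, pdf

def myerson_winner(values, dists):
    """Single-item, regular case: the bidder with the highest positive
    virtual value wins; returns None if no virtual value is positive."""
    phis = [virtual_value(v, cdf, pdf) for v, (cdf, pdf) in zip(values, dists)]
    best = max(range(len(values)), key=lambda i: phis[i])
    return best if phis[best] > 0 else None

# Bidder 0 is uniform on [1, 5], bidder 1 is uniform on [1, 9].
dists = [uniform(1, 5), uniform(1, 9)]
winner = myerson_winner([4, 5], dists)  # virtual values are 3 and 1
```

Here bidder 0 wins despite having the lower value, which is the sense in which optimal auctions for non-identical bidders can be unintuitive.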

Sample Complexity, VC Dimension, and the Pseudo-Dimension This section reviews several
well-known definitions from learning theory. Suppose there is some domain Q, and let c be some unknown
target function $c : Q \to \{0, 1\}$. Let $\mathcal{D}$ be an unknown distribution over Q. We wish to understand how
many labeled samples $(x, c(x))$, $x \sim \mathcal{D}$, are necessary and sufficient to be able to output a $\hat{c}$ which agrees
with c almost everywhere with respect to $\mathcal{D}$. The distribution-independent sample complexity of learning c
depends fundamentally on the “complexity” of the set of binary functions C from which we are choosing $\hat{c}$.
We define the relevant complexity measure next.

Let S be a set of m samples from Q. The set S is said to be shattered by C if, for every subset $T \subseteq S$,
there is some $c_T \in C$ such that $c_T(x) = 1$ if $x \in T$ and $c_T(y) = 0$ if $y \notin T$. That is, ranging over all $c \in C$
induces all possible projections onto S. The VC dimension of C, denoted VC(C), is the size of the largest
set S that can be shattered by C.
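
The definition of shattering can be checked by brute force for a finite collection of hypotheses. The sketch below is illustrative only; `is_shattered` and `threshold_class` are hypothetical names, and the class of threshold functions $x \mapsto 1[x \geq \theta]$ is an assumed toy example rather than a class studied in this paper.

```python
def is_shattered(points, hypotheses):
    """True if the hypotheses realize all 2^|points| binary labelings of `points`."""
    labelings = {tuple(h(x) for x in points) for h in hypotheses}
    return len(labelings) == 2 ** len(points)

def threshold_class(thetas):
    """Binary functions c_theta(x) = 1 if x >= theta, else 0."""
    return [lambda x, th=th: int(x >= th) for th in thetas]

one_point = is_shattered([2.0], threshold_class([1.0, 3.0]))
two_points = is_shattered([2.0, 4.0], threshold_class([1.0, 3.0, 5.0]))
```

A single point is shattered, but no pair x < y can be, because no threshold labels x with 1 and y with 0; the VC dimension of this toy class is therefore 1.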

Let $\mathrm{err}_S(\hat{c}) = \left(\sum_{x \in S} |c(x) - \hat{c}(x)|\right)/|S|$ denote the empirical error of $\hat{c}$ on S, and let $\mathrm{err}(\hat{c}) = \mathbf{E}_{x \sim \mathcal{D}}[|c(x) - \hat{c}(x)|]$
denote the true expected error of $\hat{c}$ with respect to $\mathcal{D}$. A key result from learning theory [23] is:
for every distribution $\mathcal{D}$, a sample S of size $\Omega(\epsilon^{-2}(\mathrm{VC}(C) + \ln\frac{1}{\delta}))$ is sufficient to guarantee that $\mathrm{err}_S(\hat{c}) \in
[\mathrm{err}(\hat{c}) - \epsilon, \mathrm{err}(\hat{c}) + \epsilon]$ for every $\hat{c} \in C$ with probability $1 - \delta$. In this case, the error on the sample is close
to the true error, simultaneously for every hypothesis in C. In particular, choosing the hypothesis with the
minimum sample error minimizes the true error, up to $2\epsilon$. We say C is $(\epsilon, \delta)$-uniformly learnable with sample
complexity m if, given a sample S of size m, with probability $1 - \delta$, for all $\hat{c} \in C$, $|\mathrm{err}_S(\hat{c}) - \mathrm{err}(\hat{c})| < \epsilon$:
thus, any class C is $(\epsilon, \delta)$-uniformly learnable with $m = O\left(\frac{1}{\epsilon^2}\left(\mathrm{VC}(C) + \ln\frac{1}{\delta}\right)\right)$ samples. Conversely, for every
learning algorithm A that uses too few samples (fewer than a threshold that grows with VC(C) and $1/\epsilon$), there exists a distribution $\mathcal{D}'$ and a constant q
such that, with probability at least q, A outputs a hypothesis $\hat{c}' \in C$ with $\mathrm{err}(\hat{c}') > \mathrm{err}(\hat{c}) + \frac{\epsilon}{2}$ for some $\hat{c} \in C$.
That is, the true error of the output hypothesis is more than $\frac{\epsilon}{2}$ larger than that of the best hypothesis in the class.

To learn real-valued functions, we need a generalization of VC dimension (which concerns binary functions).
The pseudo-dimension [19] does exactly this. Formally, let $c : Q \to [0, H]$ be a real-valued function
over Q, and C the class we are learning over. Let S be a sample drawn from $\mathcal{D}$, $|S| = m$, labeled according
to c. Both the empirical and true error of a hypothesis $\hat{c}$ are defined as before, though $|c(x) - \hat{c}(x)|$ can
now take on values in $[0, H]$ rather than in $\{0, 1\}$. Let $(r^1, \ldots, r^m) \in [0, H]^m$ be a set of targets for S. We
say $(r^1, \ldots, r^m)$ witnesses the shattering of S by C if, for each $T \subseteq S$, there exists some $c_T \in C$ such that
$c_T(x^i) \geq r^i$ for all $x^i \in T$ and $c_T(x^i) < r^i$ for all $x^i \notin T$. If there exists some r witnessing the shattering
of S, we say S is shatterable by C. The pseudo-dimension of C, denoted $d_C$, is the size of the largest set S
which is shatterable by C. The sample complexity upper bounds of this paper are derived from the following
theorem, which states that the distribution-independent sample complexity of learning over a class of
real-valued functions C is governed by the class’s pseudo-dimension.

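Theorem 2.1 (E.g. [1]) Suppose C is a class of real-valued functions with range in $[0, H]$ and pseudo-dimension
$d_C$. For every $\epsilon > 0$, $\delta \in [0, 1]$, the sample complexity of $(\epsilon, \delta)$-uniformly learning f with respect
to C is $m = O\left(\left(\frac{H}{\epsilon}\right)^2 \left(d_C \ln\left(\frac{H}{\epsilon}\right) + \ln\left(\frac{1}{\delta}\right)\right)\right)$.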

Moreover, the guarantee in Theorem 2.1 is realized by the learning algorithm that simply outputs the function
$\hat{c} \in C$ with the smallest empirical error on the sample.

Applying Pseudo-Dimension to Auction Classes For the remainder of this paper, we consider classes
of truthful auctions C.^2 When we discuss some auction $c \in C$, we treat $c : [0, H]^n \to \mathbb{R}$ as the function that
maps (truthful) bid tuples to the revenue achieved on them by the auction c. Then, rather than minimizing
error, we aim to maximize revenue. In our setting, the guarantee of Theorem 2.1 directly implies that, with
probability at least $1 - \delta$ (over the m samples), the output of the empirical revenue maximization learning
algorithm, which returns the auction $c \in C$ with the highest average revenue on the samples, is
an auction with expected revenue (over the true underlying distribution F) that is within an additive $\epsilon$ of
the maximum possible.
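
Empirical revenue maximization is simple to state in code when the class is finite. The following is a minimal sketch, assuming the simplest possible class (a single bidder facing a posted price, with candidate prices taken from the samples); `posted_price_revenue` and `empirical_revenue_maximizer` are hypothetical names, and the richer classes studied below would need a search over thresholds instead of a finite list.

```python
def posted_price_revenue(price, value):
    """Revenue of a posted price on one sampled valuation."""
    return price if value >= price else 0.0

def empirical_revenue_maximizer(samples, candidates, revenue):
    """Return the candidate auction with the highest average revenue on the samples."""
    def average_revenue(c):
        return sum(revenue(c, s) for s in samples) / len(samples)
    return max(candidates, key=average_revenue)

samples = [1.0, 2.0, 2.0, 4.0]          # assumed toy data
candidates = sorted(set(samples))       # prices worth trying
best_price = empirical_revenue_maximizer(samples, candidates, posted_price_revenue)
```

On this toy data the average revenues of prices 1, 2 and 4 are 1.0, 1.5 and 1.0, so the learner outputs the price 2.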


3 Single-Item Auctions 

To illustrate our ideas, we first focus on single-item auctions. The results of this section are generalized
significantly in Sections 5 and 6. Section 3.1 defines the class of t-level single-item auctions, gives an
example, and interprets the auctions as approximations to virtual welfare maximizers. Section 3.2 proves
that the pseudo-dimension of the set of such auctions is $O(nt \log nt)$, which by Theorem 2.1 implies a
sample-complexity upper bound. Section 3.3 proves that taking t on the order of $\frac{1}{\epsilon} + \log_{1+\epsilon} H$ yields low representation error.

3.1 t-Level Auctions: The Single-Item Case 

We now introduce t-level auctions, or $C_t$ for short. Intuitively, one can think of each bidder as facing one of
t possible prices; the price they face depends upon the values of the other bidders. Consider, for each bidder
i, t numbers $0 \leq \ell_{i,0} \leq \ell_{i,1} \leq \cdots \leq \ell_{i,t-1}$. We refer to these t numbers as thresholds. This set of tn numbers
defines a t-level auction with the following allocation rule. Consider a valuation tuple v:

1. For each bidder i, let $t_i(v_i)$ denote the index $\tau$ of the largest threshold $\ell_{i,\tau}$ that lower bounds $v_i$ (or $-1$
if $v_i < \ell_{i,0}$). We call $t_i(v_i)$ the level of bidder i.

2. Sort the bidders from highest level to lowest level and, within a level, use a fixed lexicographical
tie-breaking ordering $\succ$ to pick the winner.^3

3. Award the item to the first bidder in this sorted order (unless $t_i = -1$ for every bidder i, in which case
there is no sale).

The payment rule is the unique one that renders truthful bidding a dominant strategy and charges 0 to
losing bidders; that is, the winning bidder pays the lowest bid at which she would continue to win. It
is important for us to understand this payment rule in detail; there are three interesting cases. Suppose
bidder i is the winner. In the first case, i is the only bidder who might be allocated the item (other bidders
have level $-1$), in which case her bid must be at least her lowest threshold. In the second case, there are
multiple bidders at her level, so she must bid high enough to be at her level (and, since ties are broken
lexicographically, this is her threshold to win). In the final case, she need not compete at her level: she can
choose to either pay one level above her competition (in which case her position in the tie-breaking ordering
does not matter) or she can bid at the same level as her highest-level competitors (in which case she only
wins if she dominates all of those bidders at the next-highest level according to $\succ$). Formally, the payment
p of the winner i (if any) is as follows. Let $\bar{\tau}$ denote the highest level $\tau$ such that there are at least two bidders
at or above level $\tau$, and I be the set of bidders other than i whose level is at least $\bar{\tau}$.

Monop If $\bar{\tau} = -1$, then $p_i = \ell_{i,0}$ (she is the only potential winner, but must have level $\geq 0$ to win).

Mult If $t_i(v_i) = \bar{\tau}$ then $p_i = \ell_{i,\bar{\tau}}$ (she needs to be at level $\bar{\tau}$).

Unique If $t_i(v_i) > \bar{\tau}$: if $i \succ i'$ for all $i' \in I$, she pays $p_i = \ell_{i,\bar{\tau}}$; otherwise she pays $p_i = \ell_{i,\bar{\tau}+1}$ (she either needs
to be at level $\bar{\tau} + 1$, in which case her position in $\succ$ does not matter, or at level $\bar{\tau}$, in which case she
would need to be the highest according to $\succ$).

^2 An auction is truthful if truthful bidding is a dominant strategy for every bidder. That is: for every bidder i, and all
possible bids by the other bidders, i maximizes its expected utility (value minus price paid) by bidding its true value. In the
single-parameter settings that we study, the expected revenue of the optimal non-truthful auction (measured at a Bayes-Nash
equilibrium with respect to the prior distribution) is no larger than that of the optimal truthful auction.

^3 Our results hold also for some other tie-breaking rules.

We now describe a particular t-level auction, and demonstrate each case of the payment rule.

Example 3.1 Consider the following 4-level auction for bidders a, b, c. Let $\ell_{a,\cdot} = [2, 4, 6, 8]$, $\ell_{b,\cdot} = [1.5, 5, 9, 10]$,
and $\ell_{c,\cdot} = [1.7, 3.9, 6, 7]$. For example, if bidder a bids less than 2 she is at level $-1$, a bid in [2,4) puts her
at level 0, a bid in [4,6) at level 1, a bid in [6,8) at level 2, and a bid of at least 8 at level 3. Let $a \succ b \succ c$.

Monop If $v_a = 3$, $v_b < 1.5$, $v_c < 1.7$, then b, c are at level $-1$ (to which the item is never allocated). So, a wins
and pays 2, the minimum she needs to bid to be at level 0.

Mult If $v_a \geq 8$, $v_b \geq 10$, $v_c < 7$, then a and b are both at level 3, and $a \succ b$, so a will win and pays 8 (the
minimum she needs to bid to be at level 3).

Unique If $v_a \geq 8$, $v_b \in [5, 9)$, $v_c \in [3.9, 6)$, then a is at level 3, and b and c are at level 1. Since $a \succ b$ and $a \succ c$,
a need only pay 4 (enough to be at level 1). If, on the other hand, $v_a \in [4, 6)$, $v_b \in [5, 9)$ and $v_c \geq 6$, c
has level at least 2 (while a, b have level 1), but c needs to pay 6 since $a, b \succ c$.
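
The allocation and payment rules of Section 3.1 can be sketched in a few lines and checked against Example 3.1. This is an illustrative implementation under our reading of the three payment cases; the names `level`, `run_t_level_auction`, `priority` and `tau_bar` are hypothetical, and the tie-breaking order is passed as a list with earlier bidders winning ties.

```python
from bisect import bisect_right

def level(thresholds, bid):
    """Index of the largest threshold <= bid, or -1 if the bid is below all of them."""
    return bisect_right(thresholds, bid) - 1

def run_t_level_auction(thresholds, priority, bids):
    """Return (winner, payment), or (None, 0) if nobody reaches level 0.

    thresholds: dict bidder -> sorted list of t thresholds
    priority:   list of bidders; earlier entries win ties
    bids:       dict bidder -> (truthful) bid
    """
    lv = {i: level(thresholds[i], bids[i]) for i in priority}
    rank = {i: k for k, i in enumerate(priority)}
    # Allocation: highest level first, ties broken by the fixed priority order.
    winner = min(priority, key=lambda i: (-lv[i], rank[i]))
    if lv[winner] == -1:
        return None, 0
    # tau_bar: highest level with at least two bidders at or above it, i.e. the
    # best level among the winner's competitors (-1 if none reaches level 0).
    tau_bar = max((lv[j] for j in priority if j != winner), default=-1)
    if tau_bar == -1:                        # Monop
        return winner, thresholds[winner][0]
    if lv[winner] == tau_bar:                # Mult
        return winner, thresholds[winner][tau_bar]
    # Unique: the winner is strictly above every competitor.
    rivals = [j for j in priority if j != winner and lv[j] >= tau_bar]
    if all(rank[winner] < rank[j] for j in rivals):
        return winner, thresholds[winner][tau_bar]
    return winner, thresholds[winner][tau_bar + 1]

# The auction of Example 3.1, with tie-breaking order a, b, c.
thresholds = {"a": [2, 4, 6, 8], "b": [1.5, 5, 9, 10], "c": [1.7, 3.9, 6, 7]}
priority = ["a", "b", "c"]
monop = run_t_level_auction(thresholds, priority, {"a": 3, "b": 1, "c": 1})
mult = run_t_level_auction(thresholds, priority, {"a": 8.5, "b": 10, "c": 6.5})
unique_1 = run_t_level_auction(thresholds, priority, {"a": 8, "b": 6, "c": 4})
unique_2 = run_t_level_auction(thresholds, priority, {"a": 5, "b": 6, "c": 6.5})
```

The four calls reproduce the payments 2, 8, 4 and 6 from the example; the specific bids are assumed representatives of the ranges given there.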

Remark 3.2 (Connection to virtual valuation functions) t-level auctions are naturally interpreted as
discrete approximations to virtual welfare maximizers, and our representation error bound in Theorem 3.4
makes this precise. Each level corresponds to a constraint of the form “If any bidder has level at least $\tau$,
do not sell to any bidder with level less than $\tau$.” We can interpret the $\ell_{i,\tau}$’s (with fixed $\tau$, ranging over
bidders i) as the bidder values that map to some common virtual value. For example, 1-level auctions treat
all values below the single threshold as having negative virtual value, and above the threshold use values
as proxies for virtual values. 2-level auctions use the second threshold to refine the virtual value estimates,
and so on. With this interpretation, it is intuitively clear that as $t \to \infty$, it is possible to estimate bidders’
virtual valuation functions and thus approximate the optimal auction to arbitrary accuracy.

3.2 The Pseudo-Dimension of t-Level Auctions 

This section shows that the pseudo-dimension of the class of t-level single-item auctions with n bidders is
$O(nt \log nt)$. Combining this with Theorem 2.1 immediately yields sample complexity bounds (parameterized
by t) for learning the best such auction from samples.

Theorem 3.3 For a fixed tie-breaking order $\succ$, the pseudo-dimension of the set of n-bidder single-item
t-level auctions is $O(nt \log(nt))$.

Proof: Recall from Section 2 that we need to upper bound the size of every set that is shatterable using
t-level auctions. Fix a set of samples $S = (v^1, \ldots, v^m)$ of size m and a potential witness $R = (r^1, \ldots, r^m)$.
Each auction c induces a binary labeling of the samples $v^j$ of S (whether c’s revenue on $v^j$ is at least $r^j$ or
strictly less than $r^j$). The set S is shattered with witness R if and only if the number of distinct labelings
of S given by any t-level auction is $2^m$.

We upper-bound the number of distinct labelings of S given by t-level auctions (for some fixed potential
witness R), counting the labelings in two stages. Note that S involves nm numbers: one value $v_i^j$ for each
bidder for each sample. A t-level auction involves nt numbers: t thresholds for each bidder. Call two
t-level auctions with thresholds $\{\ell_{i,\tau}\}$ and $\{\hat{\ell}_{i,\tau}\}$ equivalent if:

1. The relative order of the $\ell_{i,\tau}$’s agrees with that of the $\hat{\ell}_{i,\tau}$’s, in that both induce the same permutation
of $\{1, 2, \ldots, n\} \times \{0, 1, \ldots, t-1\}$.

2. Merging the sorted list of the $v_i^j$’s with the sorted list of the $\ell_{i,\tau}$’s yields the same partition of the $v_i^j$’s
as does merging it with the sorted list of the $\hat{\ell}_{i,\tau}$’s.

Note that this is an equivalence relation. If two t-level auctions are equivalent, every comparison between
two numbers (valuations or thresholds) is resolved identically by those auctions. Using the defining properties
of equivalence, a crude upper bound on the number of equivalence classes is

$(nt)! \cdot \binom{nm + nt}{nt} \leq (nm + nt)^{2nt}$. (1)

We now upper-bound the number of distinct labelings of S that can be generated by any auction in a single
equivalence class C. First, as all comparisons between two numbers (valuations or thresholds) are resolved
identically for all auctions in C, each bidder i in each sample $v^j$ of S is assigned the same level (across
auctions in C), and the winner (if any) in each sample $v^j$ is constant across all of C. By the same reasoning,
the identity of the parameter that gives the winner’s payment (some $\ell_{i,\tau}$) is uniquely determined by pairwise
comparisons (recall Section 3.1) and hence is common across all auctions in C. The payments $\ell_{i,\tau}$, however,
can vary across auctions in the equivalence class.

For a bidder i and level $\tau \in \{0, 1, 2, \ldots, t-1\}$, let $S_{i,\tau} \subseteq S$ be the subset of samples in which bidder i wins
and pays $\ell_{i,\tau}$. The revenue obtained by each auction in C on a sample of $S_{i,\tau}$ is simply $\ell_{i,\tau}$ (and independent
of all other parameters of the auction). Thus, ranging over all t-level auctions in C generates at most $|S_{i,\tau}| + 1$
distinct binary labelings of $S_{i,\tau}$, since the possible subsets of $S_{i,\tau}$ for which an auction meets the corresponding
target $r^j$ form a nested collection.

Summarizing, within the equivalence class C of t-level auctions, varying a parameter $\ell_{i,\tau}$ generates at
most $|S_{i,\tau}| + 1$ different labelings of the samples $S_{i,\tau}$ and has no effect on the other samples. Since the subsets
$\{S_{i,\tau}\}_{i,\tau}$ are disjoint, varying all of the $\ell_{i,\tau}$’s (i.e., ranging over C) generates at most

$\prod_{i=1}^{n} \prod_{\tau=0}^{t-1} \left(|S_{i,\tau}| + 1\right) \leq (m + 1)^{nt}$ (2)

distinct labelings of S.

Combining (1) and (2), the class of all t-level auctions produces at most $(nm + nt)^{3nt}$ distinct labelings
of S. Since shattering S requires $2^m$ distinct labelings, we conclude that $2^m \leq (nm + nt)^{3nt}$, implying
$m = O(nt \log nt)$ as claimed. ■
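
The final step, from the counting inequality to the bound on m, can be spelled out as follows. This is a sketch that assumes the inequality has the form $2^m \leq (nm + nt)^{c \cdot nt}$ for an absolute constant c and that $nt \geq 2$; the symbols c, a, b and y are introduced here only for the derivation.

```latex
\begin{align*}
2^m \le (nm+nt)^{c\,nt}
  &\;\Longrightarrow\; m \le c\,nt \,\log_2(nm+nt). \\
\intertext{If $m \le t$ then $m \le nt$ and we are done. Otherwise $nm + nt \le 2nm$,
so with $a = c\,nt$, $b = 2n$ and $m = a y$:}
a y \le a \log_2(a b y)
  &\;\Longrightarrow\; y \le \log_2(ab) + \log_2 y. \\
\intertext{Since $\log_2 y \le y/2$ for $y \ge 4$, either $y < 4$ or $y \le 2\log_2(ab)$. Hence}
m \le a \cdot \max\{4,\; 2\log_2(ab)\}
  &= O\bigl(nt \log(n^2 t)\bigr) = O(nt \log nt).
\end{align*}
```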

3.3 The Representation Error of Single-Item t-Level Auctions 

In this section, we show that for every bounded product distribution, there exists a t-level auction with
expected revenue close to that of the optimal single-item auction.
The analysis “rounds” an optimal auction to a t-level auction without losing much expected revenue. This
is done using thresholds to approximate each bidder’s virtual value: the lowest threshold at the bidder’s
monopoly reserve price, the next $\frac{1}{\epsilon}$ thresholds at the values at which bidder i’s virtual value surpasses
multiples of $\epsilon$, and the remaining thresholds at those values where bidder i’s virtual value reaches powers of
$1 + \epsilon$. Theorem 3.4 formalizes this intuition.

Theorem 3.4 Suppose F is a product distribution over $[1, H]^n$. If $t = \Omega\left(\frac{1}{\epsilon} + \log_{1+\epsilon} H\right)$, then $C_t$ contains a
single-item auction with expected revenue at least $1 - \epsilon$ times the optimal expected revenue.

With an eye toward our generalizations, we prove the following more general result. Theorem 3.4 follows 
immediately by taking $\alpha = \gamma = 1$.

Lemma 3.5 Consider n bidders with valuations in $[0, H]$ and with $P[\max_i v_i > \alpha] \geq \gamma$. Then, $C_t$ contains
a single-item auction with expected revenue at least $1 - \epsilon$ times that of an optimal auction, for
$t = \Theta\left(\frac{1}{\gamma\epsilon} + \log_{1+\epsilon} \frac{H}{\alpha}\right)$.

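Proof: Consider a fixed bidder i. We define t thresholds for i, bucketing i by her virtual value, and prove
that the t-level auction A using these thresholds for each bidder closely approximates the expected revenue
of the optimal auction $\mathcal{M}$. Let $\epsilon'$ be a parameter defined later.

Set $\ell_{i,0} = \phi_i^{-1}(0)$, bidder i’s monopoly reserve.^4 For $\tau \in \left[1, \lceil \frac{1}{\gamma\epsilon'} \rceil\right]$, let $\ell_{i,\tau} = \phi_i^{-1}(\tau \cdot \alpha\gamma\epsilon')$ ($\phi_i \in [0, \alpha]$).

For $\tau \in \left[\lceil \frac{1}{\gamma\epsilon'} \rceil, \lceil \frac{1}{\gamma\epsilon'} \rceil + \lceil \log_{1+\frac{\epsilon}{2}} \frac{H}{\alpha} \rceil\right]$, let $\ell_{i,\tau} = \phi_i^{-1}\left(\alpha \left(1 + \frac{\epsilon}{2}\right)^{\tau - \lceil \frac{1}{\gamma\epsilon'} \rceil}\right)$ ($\phi_i > \alpha$).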

^4 Recall from Section 2 that $\phi_i$ denotes the virtual valuation function of bidder i. For the non-regular case, $\phi_i$ denotes the
ironed virtual valuation function. It is convenient to assume that these functions are strictly increasing (not just nondecreasing);
this can be enforced at the cost of losing an arbitrarily small amount of revenue.
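
The threshold construction in the proof can be sketched numerically. The code below is an illustration under stated assumptions: it takes $\epsilon' = \epsilon/2$ (the choice made at the end of the proof), uses an additive grid of step $\alpha\gamma\epsilon'$ followed by a multiplicative grid with ratio $1 + \epsilon/2$, and requires the caller to supply the inverse virtual valuation function. The function name `build_thresholds` and the uniform example are hypothetical.

```python
import math

def build_thresholds(phi_inverse, alpha, gamma, eps, H):
    """Thresholds for one bidder: the monopoly reserve, then values whose virtual
    value hits multiples of alpha*gamma*eps', then powers of (1 + eps/2) times alpha.
    Assumes eps' = eps / 2 and a strictly increasing virtual valuation function."""
    eps_prime = eps / 2.0
    k = math.ceil(1.0 / (gamma * eps_prime))             # additive levels
    r = math.ceil(math.log(H / alpha, 1.0 + eps / 2.0))  # multiplicative levels
    targets = [0.0]
    targets += [tau * alpha * gamma * eps_prime for tau in range(1, k + 1)]
    targets += [alpha * (1.0 + eps / 2.0) ** j for j in range(1, r + 1)]
    # Targets above the largest attainable virtual value yield thresholds that
    # no bid reaches; they are harmless and kept here for simplicity.
    return [phi_inverse(x) for x in targets]

# Assumed example: values uniform on [1, H] with H = 4, so phi(v) = 2v - H.
H = 4.0
phi_inverse = lambda x: (x + H) / 2.0
levels = build_thresholds(phi_inverse, alpha=1.0, gamma=1.0, eps=0.5, H=H)
```

With these parameters the construction produces 1 + 4 + 7 = 12 thresholds, the first of which is the monopoly reserve 2 for the uniform distribution on [1, 4].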

Consider a fixed valuation profile v. Let $i^*$ denote the winner according to A, and $i'$ the winner according
to the optimal auction $\mathcal{M}$. If there is no winner, we interpret $\phi_{i^*}(v_{i^*})$ and $\phi_{i'}(v_{i'})$ as 0. Recall that $\mathcal{M}$
always awards the item to a bidder with the highest positive virtual value (or no one, if no such bidders
exist). The definition of the thresholds immediately implies the following.

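1. A only allocates to non-negative ironed virtual-valued bidders.

2. If there is no tie (that is, there is a unique bidder at the highest level), then $i' = i^*$.

3. When there is a tie at level $\tau$, the virtual value of the winner of A is close to that of $\mathcal{M}$:

if $\tau \in \left[0, \lceil \frac{1}{\gamma\epsilon'} \rceil\right]$ then $\phi_{i'}(v_{i'}) - \phi_{i^*}(v_{i^*}) \leq \alpha\gamma\epsilon'$;

if $\tau \in \left[\lceil \frac{1}{\gamma\epsilon'} \rceil, \lceil \frac{1}{\gamma\epsilon'} \rceil + \lceil \log_{1+\frac{\epsilon}{2}} \frac{H}{\alpha} \rceil\right]$ then $\frac{\phi_{i^*}(v_{i^*})}{\phi_{i'}(v_{i'})} \geq 1 - \frac{\epsilon}{2}$.

These facts imply that

$\mathbf{E}_v[\mathrm{Rev}(A)] = \mathbf{E}_v[\phi_{i^*}(v_{i^*})] \geq \left(1 - \frac{\epsilon}{2}\right) \cdot \mathbf{E}_v[\phi_{i'}(v_{i'})] - \alpha\gamma\epsilon' = \left(1 - \frac{\epsilon}{2}\right) \cdot \mathbf{E}_v[\mathrm{Rev}(\mathcal{M})] - \alpha\gamma\epsilon'$, (3)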

where the first and final equalities follow because the allocations of A and $\mathcal{M}$ depend only on ironed virtual
values, not on the values themselves; hence the ironed virtual values of the winners are equal in expectation
to the unironed virtual values, and thus to the revenue, of the mechanisms (see [13], Chapter 3.5 for discussion).

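The assumption that $P[\max_i v_i > \alpha] \geq \gamma$ implies (since a feasible auction prices the good at $\alpha$ and awards
it to any bidder with value at least $\alpha$, if any) that $\mathbf{E}[\mathrm{Rev}(\mathcal{M})] \geq \alpha\gamma$. Combining this with (3), and setting
$\epsilon' = \frac{\epsilon}{2}$, implies $\mathbf{E}_v[\mathrm{Rev}(A)] \geq (1 - \epsilon)\,\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})]$. ■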
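
For completeness, the last combination step can be written out. This is a sketch under the reading that inequality (3) carries the factor $1 - \epsilon/2$ and that $\epsilon' = \epsilon/2$.

```latex
\begin{align*}
\mathbf{E}_v[\mathrm{Rev}(A)]
 &\ge \left(1 - \tfrac{\epsilon}{2}\right)\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})] - \alpha\gamma\epsilon'
   && \text{(inequality (3))} \\
 &= \left(1 - \tfrac{\epsilon}{2}\right)\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})] - \tfrac{\epsilon}{2}\,\alpha\gamma
   && (\epsilon' = \epsilon/2) \\
 &\ge \left(1 - \tfrac{\epsilon}{2}\right)\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})] - \tfrac{\epsilon}{2}\,\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})]
   && (\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})] \ge \alpha\gamma) \\
 &= (1 - \epsilon)\,\mathbf{E}_v[\mathrm{Rev}(\mathcal{M})].
\end{align*}
```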

Combining Theorems 2.1 and 3.4 yields the following Corollary 3.6. 


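Corollary 3.6 Let F be a product distribution with all bidders’ valuations in $[1, H]$. Assume that
$t = \Theta\left(\frac{1}{\epsilon} + \log_{1+\epsilon} H\right)$ and $m = O\left(\left(\frac{H}{\epsilon}\right)^2 \left(nt \log(nt) \log\frac{H}{\epsilon} + \log\frac{1}{\delta}\right)\right) = \tilde{O}\left(\frac{H^2 n}{\epsilon^3}\right)$. Then with probability at least
$1 - \delta$, the single-item empirical revenue maximizer of $C_t$ on a set of m samples from F has expected revenue
at least $1 - \epsilon$ times that of the optimal auction.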
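
To get a feel for how the number of levels and the sample size scale, the sketch below evaluates the expressions $\frac{1}{\epsilon} + \log_{1+\epsilon} H$ and $\left(\frac{H}{\epsilon}\right)^2 \left(d \ln\frac{H}{\epsilon} + \ln\frac{1}{\delta}\right)$ with $d = nt \ln(nt)$. It is a rough calculator only: the asymptotic notation hides constants, so the `constant` parameter is a placeholder assumption, and the function names are hypothetical.

```python
import math

def levels_needed(eps, H):
    """Number of levels on the order of 1/eps + log_{1+eps} H (constants ignored)."""
    return math.ceil(1.0 / eps + math.log(H, 1.0 + eps))

def sample_bound(n, eps, delta, H, constant=1.0):
    """Order-of-magnitude sample size from the pseudo-dimension bound.
    `constant` stands in for the unspecified constant in the O(.) notation."""
    t = levels_needed(eps, H)
    d = n * t * math.log(n * t)          # pseudo-dimension, up to constants
    return math.ceil(constant * (H / eps) ** 2
                     * (d * math.log(H / eps) + math.log(1.0 / delta)))

t_example = levels_needed(0.5, 4.0)
m_loose = sample_bound(n=2, eps=0.5, delta=0.05, H=4.0)
m_tight = sample_bound(n=2, eps=0.25, delta=0.05, H=4.0)
```

Halving the error parameter increases both the number of levels and the sample requirement, consistent with the polynomial dependence on $1/\epsilon$.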


4 Unbounded MHR Distributions 


This section shows how to replace the assumption of bounded valuations by the assumption that each
valuation distribution satisfies the monotone hazard rate (MHR) condition, meaning that $\frac{f_i(v_i)}{1 - F_i(v_i)}$ is
nondecreasing. Our resulting sample complexity bounds depend on the number of bidders n and the error
parameter $\epsilon$ only. This extension is based on previous work [5] that
effectively reduces the case of MHR valuations to the case of valuations lying in the interval $[\beta\epsilon, 2\beta\log\frac{1}{\epsilon}]$
for a suitable choice of $\beta$. Our analysis works with $\eta$-truncated t-level auctions, where each t-level auction
is replaced with its truncation at $\eta$, obtained by taking a minimum with $\eta$.


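Theorem 4.1 Suppose F is a product distribution and each bidder’s valuation distribution satisfies the MHR
condition. Then, for each $\epsilon > 0$, and each $\hat\beta \ge \beta$ such that $P[\max_i v_i \ge \hat\beta/2] \ge 1 - \frac{1}{\sqrt{e}} - \epsilon'$, there is a t-level
$2\hat\beta\log\frac{1}{\epsilon'}$-truncated auction with expected revenue at least $1 - \epsilon$ times that of an optimal auction, where
$t = \Omega\left(\frac{1}{\epsilon'} + \log_{1+\epsilon'}\left(\log\frac{1}{\epsilon'}\right)\right)$ and $\epsilon' = O(\cdots)$.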

Before proving Theorem 4.1, we quote a key fact about MHR distributions due to Cai and Daskalakis [5].


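Theorem 4.2 (Theorem 19 and Lemma 38 of [5]) Let $X_1, \ldots, X_n$ be a collection of independent random
variables whose distributions satisfy the MHR condition. Then there exists an anchoring point $\beta$ such
that

$$P\left[\max_i X_i \ge \frac{\beta}{2}\right] \ge 1 - \frac{1}{\sqrt{e}},$$

and for all $\epsilon > 0$,

$$\int_{2\beta\log\frac{1}{\epsilon}}^{\infty} z\, f_{\max_i X_i}(z)\, dz \le 36\beta\epsilon\log\frac{1}{\epsilon}.$$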

Now, we proceed to prove Theorem 4.1. 

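Proof of Theorem 4.1: Fix $\epsilon'$, to be defined later. Conditioning on all bids being at most $2\hat\beta\log\frac{1}{\epsilon'}$ allows
us to apply Lemma 3.5 as though the valuations are bounded. In particular, since
$P[\max_i v_i \ge \hat\beta/2] \ge 1 - \frac{1}{\sqrt{e}} - \epsilon'$, Lemma 3.5 with $\alpha = \hat\beta/2$, $\gamma = 1 - \frac{1}{\sqrt{e}} - \epsilon'$ and $H = 2\hat\beta\log\frac{1}{\epsilon'}$ implies the existence of a
$t = O\left(\frac{1}{\epsilon'} + \log_{1+\epsilon'}\left(\log\frac{1}{\epsilon'}\right)\right)$-level truncated^ auction A such that

$$E\left[\mathrm{Rev}(A) \,\middle|\, \max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right] \ge (1 - \epsilon')\, E\left[\mathrm{Rev}(M) \,\middle|\, \max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right]. \qquad (4)$$

Thus, we have

$$\begin{aligned}
E[\mathrm{Rev}(A)] &\ge E\left[\mathrm{Rev}(A) \,\middle|\, \max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right] P\left[\max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right] \\
&\ge (1 - \epsilon')\, E\left[\mathrm{Rev}(M) \,\middle|\, \max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right] P\left[\max_i v_i \le 2\hat\beta\log\tfrac{1}{\epsilon'}\right] \\
&\ge (1 - \epsilon')\, E[\mathrm{Rev}(M)] - 36\beta\epsilon'\log\tfrac{1}{\epsilon'} \\
&\ge \left(1 - O\left(\epsilon'\log\tfrac{1}{\epsilon'}\right)\right) E[\mathrm{Rev}(M)],
\end{aligned}$$

where the first inequality comes from the fact that A only sells to agents with non-negative virtual value
(so the revenue on a smaller region of bids is only less), the second from Equation (4) and probabilities being
at most 1, the penultimate from Theorem 4.2, and the final from the fact that
$E[\mathrm{Rev}(M)] \ge \alpha\gamma = \frac{\hat\beta}{2}\left(1 - \frac{1}{\sqrt{e}} - \epsilon'\right)$. Setting
$\epsilon = \epsilon'\log\frac{1}{\epsilon'}$ and noticing that this implies $\epsilon' \le \epsilon$ yields the desired result. ■

Corollary 4.3 follows from Theorems 2.1 and 4.1.
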

Corollary 4.3 With probability $1 - \delta$, the empirical revenue maximizer for a sample of size m of the class
of t-level $\eta$-truncated single-item auctions is a $1 - O(\epsilon)$-approximation of the optimal auction for n MHR
bidders, for $t = O\left(\frac{1}{\epsilon'} + \log_{1+\epsilon'}\left(\log\frac{1}{\epsilon'}\right)\right)$, $\epsilon' = O(\cdots)$, and

$$m = O\left(\cdots\left(nt\log(nt)\ln\frac{1}{\epsilon} + \ln\frac{1}{\delta}\right)\right),$$

where $\eta$ can be learned from the set of m samples.


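Proof: We first argue that one can learn some $\eta$ from the sample. Let $Q(S, g) = \frac{|\{v \in S : \max_i v_i \ge g\}|}{|S|}$ and
$q(g) = P[\max_i v_i \ge g]$ (the empirical and true probability that the maximum bid is at least g, respectively).

Given $\epsilon'$, consider a set of samples $S_{\epsilon'}$ of profiles, and compute the largest $\hat\beta$ such that $Q(S_{\epsilon'}, \hat\beta/2) \ge 1 - \frac{1}{\sqrt{e}} - \epsilon'$.

Standard VC-bounds imply that $|q(p) - Q(S_{\epsilon'}, p)| \le \epsilon'$ for all p with probability at least $1 - \delta$, provided
$|S_{\epsilon'}| = \Omega\left(\left(\frac{1}{\epsilon'}\right)^2 \ln\frac{1}{\delta}\right)$. In particular, with probability $1 - \delta$, we will have $Q(S_{\epsilon'}, \beta/2) \ge 1 - \frac{1}{\sqrt{e}} - \epsilon'$, so it will
be the case that $\hat\beta \ge \beta$, and also $q(\hat\beta/2) \ge 1 - \frac{1}{\sqrt{e}} - 2\epsilon'$. Then, let $\eta = 2\hat\beta\log\frac{1}{\epsilon'}$.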


^Lemma 3.5 only implies the existence of such a t-level auction. However, when the bids are all below some $\eta$, one can
always find an $\eta$-truncated auction which is equivalent to each untruncated auction.

®This can be thought of as a class of binary classifiers with VC-dimension one.

Now, Theorem 4.1 implies the existence of an $\eta$-truncated t-level auction which $(1 - \epsilon')$-approximates the
optimal auction. The argument is completed using the fact that, if the auctions’ values are upper-bounded
by $\eta$, one can equivalently think of the bidders’ values as being upper-bounded by $\eta$, so Theorem 2.1 implies the sample
complexity bound allowing additive error $\epsilon'\eta = 2\hat\beta\epsilon'\log\frac{1}{\epsilon'}$. This error is multiplicatively at most $\epsilon = \epsilon'\log\frac{1}{\epsilon'}$,
since $\mathrm{Rev}(M) = \Omega(\beta)$. ■
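
The first step of this proof, learning a truncation point from samples, can be sketched as follows. The sketch assumes the target probability is $1 - 1/\sqrt{e} - \epsilon'$ and that $\eta = 2\hat\beta\log(1/\epsilon')$, and it requires $\epsilon'$ small enough that the target is positive; the function names are illustrative and not from the source.

```python
import math


def empirical_tail(samples, g):
    """Q(S, g): fraction of sampled profiles whose maximum value is at least g."""
    return sum(1 for v in samples if max(v) >= g) / len(samples)


def learn_truncation(samples, eps_prime):
    """Return (beta_hat, eta) estimated from the sampled valuation profiles."""
    target = 1.0 - 1.0 / math.sqrt(math.e) - eps_prime
    if target <= 0:
        raise ValueError("eps_prime is too large for a meaningful target")
    # Q(S, g) is a step function that only changes at observed maxima,
    # so the largest qualifying g is one of them; scan from the top down.
    for g in sorted((max(v) for v in samples), reverse=True):
        if empirical_tail(samples, g) >= target:
            beta_hat = 2.0 * g  # the test is on beta_hat / 2
            eta = 2.0 * beta_hat * math.log(1.0 / eps_prime)
            return beta_hat, eta
    raise ValueError("empty sample")
```

The scan is quadratic in the number of samples as written; sorting once and counting prefixes would make it near-linear, which does not change the statistical argument.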

Remark 4.4 (Near-optimality of sample complexity) Can we do better than Theorem 3.3? Can we
learn an approximately optimal auction from a simpler class (with pseudo-dimension $\mathrm{poly}(\log H, \log n, \frac{1}{\epsilon})$,
say), allowing for much smaller sample complexity than achieved here? The answer is negative: lower
bounds in Cole and Roughgarden [8] imply that approximate revenue maximization requires sample
complexity at least linear in the number of bidders n, even when bidders’ valuations are independently drawn from
MHR distributions. Thus, every class of auctions that is sufficiently expressive to guarantee expected revenue
at least $1 - \epsilon$ times optimal must have pseudo-dimension that grows polynomially with n.


5 t-Level Matroid Auctions

This section extends the ideas and techniques from Section 3.2 to matroid environments. The straightforward 
generalization of t-level auctions to matroid environments suffices: we order the bidders by level, breaking 
ties within a level by some fixed linear ordering over agents, and greedily choose winners according to this
ordering (subject to feasibility and to bids exceeding the lowest threshold). The next theorem bounds the 
pseudo-dimension of this more general class of auctions. 
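
The greedy allocation rule just described can be sketched in a few lines. This is a minimal illustration, assuming the matroid is given by a hypothetical independence oracle `is_independent` and that the fixed linear ordering is by bidder index; payments are omitted.

```python
from bisect import bisect_right


def matroid_winners(thresholds, bids, is_independent):
    """Greedy t-level allocation: scan bidders by level, keep those preserving independence."""
    # Level = index of the highest threshold the bid meets, or -1 if none.
    levels = [bisect_right(th, b) - 1 for th, b in zip(thresholds, bids)]
    # Higher level first; ties within a level broken by bidder index (assumed ordering).
    order = sorted(range(len(bids)), key=lambda i: (-levels[i], i))
    winners = []
    for i in order:
        if levels[i] < 0:
            break  # every remaining bidder fails her lowest threshold
        if is_independent(winners + [i]):
            winners.append(i)
    return winners
```

For a 2-uniform matroid (at most two winners) one could pass `lambda s: len(s) <= 2` as the oracle.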

Theorem 5.1 The pseudo-dimension of t-level matroid auctions with n bidders is $O(nt\log(nt))$.

The proof is conceptually similar to that of Theorem 3.3, though we require a more general argument. 
Our proof uses a couple of standard results from learning theory (see e.g. [16] for details). The first, also 
known as Sauer’s Lemma, states that the number of distinct projections of a set S induced by a set system 
with bounded VC dimension grows only polynomially in $|S|$.

Lemma 5.2 Let C be a set of functions from Q to {0,1} with VC dimension d, and $S \subseteq Q$. Then

$$|\{S \cap \{x \in Q : c(x) = 1\} : c \in C\}| \le |S|^d.$$

Recall that a linear separator in $\mathbb{R}^d$ is defined by coefficients $a_1, \ldots, a_d$, and assigns $x \in \mathbb{R}^d$ the value 1
if $\sum_{i=1}^{d} a_i x_i \ge 0$ and the value 0 otherwise.

Lemma 5.3 The set of linear separators in $\mathbb{R}^d$ has VC dimension $d + 1$.

Proof of Theorem 5.1: Consider a set of samples S of size m which can be shattered by t-level matroid
auctions with revenue targets $(r^1, \ldots, r^m)$. We upper-bound the number of labelings of S possible using
t-level auctions, which again yields an upper bound on m.

We partition auctions into equivalence classes, identically to the proof of Theorem 3.3. Recall that, across 
all auctions in an equivalence class, all comparisons between two thresholds or a threshold and a bid are 
resolved identically. Recall also that the number of equivalence classes is at most $(nm + nt)^{nt}$. We now
upper bound the number of distinct labelings any fixed equivalence class C of auctions can generate. 

Consider a class C of equivalent auctions. The allocation and payment rules are more complicated than 
in the single-item case but still relatively simple. In particular, whether or not a bidder wins depends only 
on the ordering of the bidders (by level) and the fixed tie-breaking rule and thus is a function only of 
comparisons between bids and thresholds. This implies that, for every sample in S, all auctions in the class C 
declare the same set of winners. It also implies that the payment of each winning bidder is a fixed threshold 
and the identity of this parameter is the same across all auctions in C.

Now, encode each auction $A \in C$ and sample $v^j$ as an $(nt + 1)$-dimensional vector as follows. Let $x^A_{i,\tau}$
equal the value of $\ell_{i,\tau}$ in the auction A. Define $x^A_0 = 1$ for every $A \in C$. Define $y^j_{i,\tau} = 1$ if bidder i is a
winner paying her threshold $\ell_{i,\tau}$ for auctions in C and 0 otherwise. Finally, define $y^j_0 = -r^j$. The point
is that, for every auction A in the class C and sample $v^j$,

$$x^A \cdot y^j \ge 0$$

if and only if $\mathrm{Rev}(A)(v^j) \ge r^j$. Thus, the number of distinct labelings of the samples generated by auctions in
C is bounded above by the number of distinct sign patterns on m points in $\mathbb{R}^{nt+1}$ generated by all linear
separators. (The $y^j$-vectors are constant across C and can be viewed as m fixed points in $\mathbb{R}^{nt+1}$; each
auction $A \in C$ corresponds to the vector $x^A$ of coefficients.) Applying Lemmas 5.2 and 5.3, t-level matroid
auctions can generate at most $m^{nt+2}$ labelings per equivalence class, and hence at most $(nm + nt)^{nt} m^{nt+2}$
distinct labelings in total. This imposes the restriction

$$2^m \le (nm + nt)^{nt} m^{nt+2};$$

solving for m yields the desired bound. ■
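
The encoding used in this proof can be checked on a toy instance. The sketch below assumes the set of (bidder, level) pairs that are paid on a sample is already known, since it is fixed across the equivalence class; `encode_auction`, `encode_sample`, and `paid_slots` are illustrative names, not notation from the source.

```python
def encode_auction(thresholds):
    """x^A: a leading 1 followed by the nt threshold values, bidder by bidder."""
    return [1.0] + [float(l) for th in thresholds for l in th]


def encode_sample(paid_slots, n, t, target):
    """y^j: minus the revenue target, then 0/1 indicators of the paid (bidder, level) pairs."""
    y = [0.0] * (n * t + 1)
    y[0] = -float(target)
    for i, tau in paid_slots:
        y[1 + i * t + tau] = 1.0
    return y


def dot(x, y):
    """Inner product; equals (revenue on the sample) minus (revenue target)."""
    return sum(a * b for a, b in zip(x, y))
```

With thresholds [[1, 3], [2, 4]] and paid pairs (0, 1) and (1, 0), the revenue is 3 + 2 = 5, so the inner product is non-negative exactly when the target is at most 5.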

We now extend our representation error bound for t-level single-item auctions to matroids. 

Theorem 5.4 Consider an arbitrary matroid environment. Suppose F is a product distribution with
valuations in $[1, H]$. Provided $t = \Omega\left(\frac{1}{\epsilon} + \log_{1+\epsilon} H\right)$, there exists a t-level matroid auction with expected
revenue at least a $1 - \epsilon$ fraction of the optimal expected revenue.

The key new idea in the proof is to exhibit a bijection between the feasible sets $I^*$ (our winning set)
and $I'$ (the optimal winning set) such that each bidder from $I^*$ has a level at least as high as their bijective
partner in $I'$. To implement this, we use the following property of matroids (e.g. [14, 20]).

Proposition 5.5 Let Opt denote the largest-weight set of a matroid, and let B be any other feasible set
such that $|B| = |\mathrm{Opt}|$, and let $\mathrm{Opt}_i$, $B_i$ denote the i-th largest element of Opt and B, respectively. Then
$w(\mathrm{Opt}_i) \ge w(B_i)$ for all i.
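
Proposition 5.5 can be sanity-checked by brute force on a small example. The sketch below uses a partition matroid (at most `cap[g]` elements from each group g) as an illustrative choice; the weights and groups are made up for the check and are not from the source.

```python
from itertools import combinations


def is_independent(subset, group, cap):
    """Partition matroid: no group contributes more elements than its capacity."""
    counts = {}
    for e in subset:
        counts[group[e]] = counts.get(group[e], 0) + 1
    return all(c <= cap[g] for g, c in counts.items())


def bases(n, group, cap):
    """All maximum-size independent sets, found by exhaustive enumeration."""
    for size in range(n, -1, -1):
        found = [s for s in combinations(range(n), size)
                 if is_independent(s, group, cap)]
        if found:
            return found


def dominates(weights, opt, other):
    """True if the i-th largest weight in `opt` is at least that in `other`, for all i."""
    a = sorted((weights[e] for e in opt), reverse=True)
    b = sorted((weights[e] for e in other), reverse=True)
    return all(x >= y for x, y in zip(a, b))


weights = [5, 4, 3, 2, 1]
group = [0, 0, 0, 1, 1]
cap = {0: 2, 1: 1}
all_bases = bases(len(weights), group, cap)
opt = max(all_bases, key=lambda s: sum(weights[e] for e in s))
```

Here `opt` is the basis {0, 1, 3}, and `dominates(weights, opt, b)` holds for every basis `b`, as the proposition predicts.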


Proof of Theorem 5.4: Define bidders’ thresholds exactly as in the proof of Theorem 3.4 and let A denote the
corresponding t-level auction. Fix an arbitrary valuation profile v. Let $I^*$ denote the set of winning bidders
in A and $I'$ the set of winning bidders in M. Recall that the latter is the feasible set that maximizes the
sum of virtual valuations. Both sets are maximally independent amongst those bidders with non-negative
virtual value (M, by virtue of being welfare-maximal, and A, by definition). Then, we claim $|I^*| = |I'|$ (if
not, by the augmentation property of matroids, the smaller set could be extended to include an element of
the larger while maintaining independence, violating its maximality).

Notice that $I^*$ is lexicographically optimal with respect to the levels, rather than the exact weights.
Proposition 5.5 implies that $I'$ is also lexicographically optimal with respect to the levels; thus, the
ith largest bidder in $I'$ has the same level as the ith largest bidder in $I^*$. Then, an accounting
argument identical to the one for the single-item case (comparing the virtual value of the ith bidder in $I^*$ to
that of the ith bidder in $I'$), summed over all bidders, completes the proof. ■

Thus, we have the following corollary about the sample complexity of $(1 - \epsilon)$-approximating Myerson in
matroid environments with t-level auctions, noting that the maximum revenue is now nH rather than H.


Corollary 5.6 With probability $1 - \delta$, the empirical revenue maximizer for a sample of size m of the class
of t-level matroid auctions is a $1 - O(\epsilon)$-approximation to Myerson for n bidders whose valuations are in
$[1, H]$, for $t = O\left(\frac{1}{\epsilon} + \log_{1+\epsilon} H\right)$ and

$$m = O\left(\cdots\left(nt\log(nt)\ln\frac{Hn}{\epsilon} + \ln\frac{1}{\delta}\right)\right).$$


6 Single-parameter t-level auctions

In this section, we show how to extend the ideas and techniques from Section 3.2 to any single-parameter 
environment which has the empty set as a feasible outcome. With this mild assumption, the results in this 
section do not require the environment to be a matroid or even downwards-closed. Before we state this 
result, we need a slight generalization of the t-level auction to this setting. Previously, no t-level auction 
would allocate to any bidder whose value was below their lowest threshold, and this will not be a possibility 
in environments which are not downwards-closed. Instead, in this setting, if any bidder fails to pass her 
lowest threshold, we will assume A will choose the empty outcome. Moreover, a t-level auction will now need
to specify more about what the various levels represent: previously, we implicitly used the kth threshold to correspond
to a value where each bidder’s virtual value would pass some quantity $q_k$.

In this general setting, we make that connection explicit. There will still be nt numbers which define a
particular t-level auction, the t threshold locations per bidder. In addition, we will consider a fixed vector
$\Phi \in \mathbb{R}^t$ (not parameterizing the auction class) which, for each level $\tau$, intuitively gives an estimate $\Phi_\tau$ of the
virtual value at level $\tau$ which is the same for all bidders i. Formally, $\Phi$ will be used to assign a real value to a feasible set $X \in \mathcal{X}$
with a valuation profile v as follows. Let $e_X = \sum_{i \in X} \Phi_{t_i(v_i)}$, where $t_i(v_i)$ as before is the level of agent i’s bid
according to $v_i$. Then, a particular t-level auction will choose the winning set $X \in \mathcal{X}$ which maximizes $e_X$
(breaking ties in some fixed way which does not depend upon the bids). If, for each bidder i and level $\tau$, the
threshold $\ell_{i,\tau}$ is placed exactly at the value at which i’s virtual valuation surpasses $\Phi_\tau$, then this auction is
approximately optimizing the virtual surplus of the winning set.
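
The allocation rule of this generalized t-level auction can be sketched as below. It assumes the feasible sets are listed explicitly (including the empty set), that $e_X$ is the sum of the estimates $\Phi_\tau$ over the levels of the bidders in X, and that ties are broken by position in that list; `choose_outcome` and its arguments are illustrative names.

```python
from bisect import bisect_right


def choose_outcome(thresholds, phi, feasible_sets, bids):
    """Pick the feasible set maximizing the summed level estimates phi[level]."""
    # Level = index of the highest threshold the bid meets, or -1 if none.
    levels = [bisect_right(th, b) - 1 for th, b in zip(thresholds, bids)]
    if min(levels) < 0:
        return frozenset()  # some bidder fails her lowest threshold: empty outcome

    def estimate(X):
        return sum(phi[levels[i]] for i in X)

    # max() returns the first maximizer, so ties are broken by list order,
    # which does not depend on the bids.
    return max(feasible_sets, key=estimate)
```

The example environment in the test, with feasible sets ∅, {0, 1} and {0}, is deliberately not downwards-closed, since {1} alone is infeasible.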

Theorem 6.1 The pseudo-dimension of t-level single-parameter auctions with n bidders is $O(nt\log(nt))$.

These auctions have slightly more complicated payment rules than those for matroids, where each agent
i was intuitively competing with (at most) one other bidder for inclusion in the winning set. Now, a bidder
i who is in the winning set X will have a payment of the following form. For a fixed assignment of levels
to bidders, sort the alternatives according to their values $e_Y$ for all $Y \in \mathcal{X}$. Let Y be the highest-ranked
alternative set which does not contain i. Then, i’s payment will be the threshold $\ell_{i,\tau}$ corresponding to the
minimal $\tau$ such that $\sum_{i' \in X \setminus \{i\}} \Phi_{t_{i'}(v_{i'})} + \Phi_\tau \ge e_Y$ (namely, the minimal bid which keeps X
preferred to Y in terms of the estimated virtual values)^. While this rule is more complicated, it is still the
case that, once each bidder is assigned to some level, the payment of each bidder in the winning set is just
one of her thresholds. Thus, the proof of Theorem 6.1 is identical to the one of Theorem 5.1 without ties.
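
The payment rule can be sketched in the same style. This is a hedged illustration that takes the bidders' levels as given, assumes the comparison with the best alternative Y is non-strict (ties are ignored, as the footnote allows), and assumes at least one feasible set (for example the empty set) excludes bidder i; the function name and signature are illustrative.

```python
def payment(i, X, thresholds, phi, feasible_sets, levels):
    """Threshold of the lowest level at which X stays preferred to the best set without i."""

    def estimate(S):
        return sum(phi[levels[j]] for j in S)

    # Value of the highest-ranked alternative that does not contain bidder i.
    e_Y = max(estimate(S) for S in feasible_sets if i not in S)
    # Contribution of the other members of X, which does not depend on i's bid.
    rest = sum(phi[levels[j]] for j in X if j != i)
    # Minimal level tau with rest + phi[tau] >= e_Y; it exists whenever X is the chosen set.
    tau = next(k for k in range(len(phi)) if rest + phi[k] >= e_Y)
    return thresholds[i][tau]
```

With thresholds [[0, 2], [0, 3]], estimates [-1, 2], feasible sets ∅, {0, 1}, {0}, and levels (1, 0), the winner 0 of X = {0} must reach level 1 to beat the empty outcome, so she pays 2.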

When considering non-downwards-closed environments, the optimal revenue may be arbitrarily close to 
0, making it difficult to argue about multiplicative approximations to the optimal revenue. Instead, we 
will give a weaker guarantee, namely, that the empirical revenue maximizer will have expected revenue 
which is additively close to the optimal expected revenue. If one is willing to restrict the environment to 
be downwards-closed, it is possible to achieve a multiplicative guarantee, since in that case the optimal 
revenue is at least 1. We now state the theorem which bounds the representation error of t-level auctions
for single-parameter environments.

Theorem 6.2 There is a t-level auction whose expected revenue is within an additive $\epsilon$ of optimal, in any
single-parameter setting $\mathcal{X}$ such that $\emptyset \in \mathcal{X}$, for $t = O\left(\frac{Hn^2}{\epsilon}\right)$, for n bidders with valuation distributions
which are product and bounded in $[0, H]$.

Proof: Let $t = \frac{Hn^2}{\epsilon} + \frac{Hn}{\epsilon}$. We will begin by defining $\Phi$, the t-dimensional vector corresponding to the
estimated virtual values. Let $\Phi_0 = -Hn$ (if any bidder has virtual value less than $-Hn$, the virtual value of any set
containing her is negative, since virtual values are upper-bounded by values and the value of the remaining
set may be at most $H(n - 1)$, so in this case one should allocate to $\emptyset$). Then, let $\Phi_\tau = \Phi_{\tau-1} + \frac{\epsilon}{n}$. Thus, we
partition the space of virtual values into additive sections of width $\frac{\epsilon}{n}$.

^Since we assume ties are broken in a way which does not depend on the bids, we can ignore ties in the payment rule, and
agents will only ever pay thresholds.

Then, for each bidder i and level $\tau$, let $\ell_{i,\tau}$ be the value at which bidder i’s virtual value surpasses
$\Phi_\tau$. Then, consider a valuation profile v on which M and this particular A disagree on the winning sets
$I^*, I' \in \mathcal{X}$. Notice that each bidder i’s virtual value is estimated correctly within an additive $\frac{\epsilon}{n}$ by $\Phi_{t_i(v_i)}$
(assuming no bidder has highly negative virtual value, in which case M and A both choose outcome $\emptyset$), and
is never overestimated. Thus, it is the case that

$$\sum_{i^* \in I^*} \phi_{i^*}(v_{i^*}) \ge \sum_{i^* \in I^*} \Phi_{t_{i^*}(v_{i^*})} \ge \sum_{i' \in I'} \Phi_{t_{i'}(v_{i'})} \ge \sum_{i' \in I'} \phi_{i'}(v_{i'}) - \epsilon,$$

and the claim follows. ■

Thus, we have the following sample complexity result for general single-parameter settings. 


Corollary 6.3 With probability $1 - \delta$, the empirical revenue maximizer on m samples S from the class
of t-level auctions has true expected revenue within an additive $\epsilon$ of Myerson’s expected revenue, for the
single-parameter environment $\mathcal{X}$ when bidders have valuations in $[0, H]$, for $t = O\left(\frac{Hn^2}{\epsilon}\right)$ and
$m = O(\cdots)$.


Open Questions 

There are some significant opportunities for follow-up research. First, there is much to do on the design of 
computationally efficient (in addition to sample-efficient) algorithms for learning a near-optimal auction. The 
present work focuses on sample complexity, and our learning algorithms are generally not computationally 
efficient.® The general research agenda here is to identify auction classes C for various settings such that: 

1. C has low representation error; 

2. C has small pseudo-dimension; 

3. There is a polynomial-time algorithm to find an approximately revenue-maximizing auction from C on 
a given set of samples.® 

There are also interesting open questions on the statistical side, notably for multi-parameter problems. 
While the negative result in [11] rules out a universally good upper bound on the sample complexity of 
learning a near-optimal mechanism in multi-parameter settings, we suspect that positive results are possible 
for several interesting special cases.